Exponentiated Odd Lomax Exponential distribution with application to COVID-19 death cases of Nepal

This study proposes a new four-parameter Exponentiated Odd Lomax Exponential (EOLE) distribution, obtained by compounding an exponentiated odd function with the Lomax distribution as a generator. The proposed model is unimodal and positively skewed, whereas its hazard rate function is monotonically increasing or inverted bathtub shaped. Several important properties of the new distribution are derived: quantile function and median; asymptotic properties and mode; moments; mean residual life and mean past lifetime; mean deviation; order statistics; and Bonferroni and Lorenz curves. The parameters are estimated by maximum likelihood estimation, least-square estimation, and the Cramér-von Mises method. A simulation study and two real data sets, "the number of deaths per day due to COVID-19 of the first wave in Nepal" and "failure stresses (in GPa) of single carbon fibers of lengths 50 mm", are used to validate the theoretical findings. The daily COVID-19 deaths recorded over 153 days in Nepal follow the proposed distribution, and there is a significant positive relationship between the predicted test positive rate and the predicted number of deaths per day. Therefore, the proposed model is an alternative model for survival and lifetime data analysis.


Introduction
Probability distributions are used extensively not only in statistics and mathematics but also in applied sciences, engineering, and life sciences. The development of new probability distributions therefore continues at a fast pace, with the aim of simulating real-life conditions and analyzing real-life data more efficiently. Over the past decade, many generalized distributions have been proposed through different modification methods, with more parameters and more flexibility than the existing ones. Nevertheless, many problems in real data remain difficult to analyze because classical or standard probability distributions do not address all data characteristics [1]. New distributions and families of distributions have thus been proposed that generalize several distributions by compounding well-known ones, which provides greater flexibility in modeling from a practical viewpoint [2].
In the literature, a new parametric distribution has been derived by adding a parameter to the exponential and Weibull distributions, yielding a new two-parameter exponential and a three-parameter Weibull distribution [3]. The Marshall-Olkin extended Lomax distribution has been derived by extending the Marshall and Olkin family of distributions based on the Lomax distribution [4]. A five-parameter McDonald Lomax distribution has been derived from the Lomax distribution [5]. Likewise, new sub-models have been formed by using the Lomax distribution as a generator with two additional positive parameters; in that work, special models such as the Lomax-normal, Lomax-Weibull, Lomax-logistic, and Lomax-Pareto distributions have been derived [6,7]. A further generalization led to the Kumaraswamy-G Poisson distribution, which has three extra positive parameters [8].
The three-parameter power Lomax distribution, which is more flexible than previous Lomax distributions, has been derived with decreasing and inverted bathtub hazard rate functions [9]. Moreover, a new two-parameter half logistic Poisson distribution has been derived and then expanded into the generalized half-logistic Poisson distribution with three parameters. Its hazard rate function can be increasing, decreasing, upside-down bathtub, or bathtub shaped [10,11]. Similarly, a three-parameter Kumaraswamy half logistic distribution has been derived from the Kumaraswamy-G family by compounding it with the half logistic distribution as the baseline distribution [2].
Furthermore, the exponentiated Weibull Lomax distribution has been derived from the exponentiated Weibull-G family [12]. The alpha power inverted exponential distribution has been derived from the inverted exponential distribution with alpha as a power, and it has proved versatile in numerous real data analyses [13]. The odd generalized exponential family has been compounded with the inverted Lomax distribution, giving the four-parameter odd generalized exponentiated inverse Lomax distribution [14]. Likewise, the odd Lomax-exponential (type III) distribution has been derived using the Lomax random variable as a generator [15]. The Lomax exponential distribution has been formed by a new modification of the Lomax distribution; it is very flexible in lifetime data modeling, with decreasing and increasing (non-monotonic) hazard shapes [16]. Similarly, the inverse Lomax distribution has been used as a generator for continuous distributions, forming the inverse Lomax-exponentiated-G family [17]. Moreover, a new Poisson inverted exponential distribution has been derived from the Poisson-G family [18]. A three-parameter half logistic Nadarajah-Haghighi extension of exponential (NHE) distribution has been derived by compounding the continuous NHE distribution with the half logistic-G family [19], and compounding the Rayleigh distribution with the exponentiated-G Poisson family by a power transformation technique has produced the exponentiated Rayleigh Poisson distribution [20].
In the literature, different distributions have been derived and their parameters estimated by a variety of techniques, such as maximum likelihood estimators, least squares estimators, weighted least squares estimators, percentile estimators, maximum product spacing estimators, minimum spacing absolute distance estimators, minimum spacing absolute log-distance estimators, Cramér-von Mises estimators, Anderson-Darling estimators, right-tailed Anderson-Darling estimators, method of moments estimators, and Bayes estimators [21][22][23][24][25].
The Coronavirus Disease 2019 (COVID-19) pandemic has devastated the world and has been accompanied by economic, social, and behavioral challenges and responses. By the end of December 2020, more than 1.5 million people had died worldwide and more than 1,800 people had died in Nepal [26,27]. Several mathematical and statistical models have already been proposed to explain the path of the pandemic. However, the characteristics of the data fluctuate, so classical probability distributions may not capture them in all cases. For example, when the data are highly skewed, either to the right or to the left, and may contain outlying observations, a classical distribution such as the normal distribution cannot be used to fit them. A flexible distribution is therefore required to capture such data. For this reason, we propose the Exponentiated Odd Lomax Exponential (EOLE) distribution to analyze the death cases of the first wave of COVID-19 in Nepal. With four parameters, it is more flexible and better equipped to handle complex data, and it thus serves our goal.
In this study, the cumulative distribution function, probability density function, reliability/survival function, hazard rate function, reversed hazard rate function, and cumulative hazard rate function are presented explicitly in the next section. We then derive some important statistical properties: quantile function and median, asymptotic properties and mode, moments, mean residual life, mean past lifetime, mean deviation, order statistics, and Bonferroni and Lorenz curves. For estimation, we employ three well-known methods to estimate the model parameters, namely Maximum Likelihood Estimation (MLE), Least-Square Estimation (LSE), and Cramér-von Mises (CVM) estimation. In the results and discussion section, we conduct a simulation study and use two real data sets to verify the theoretical findings from various aspects. Finally, we present the conclusions of this study with further discussion.

Exponentiated odd Lomax exponential distribution
The exponential distribution plays a significant role in statistics and probability theory. It describes events that occur continuously and independently at a constant average rate. It is a special case of the gamma, Weibull, and Erlang distributions, and it is the continuous analog of the geometric distribution, with which it shares the key property of being memoryless. The exponential distribution is therefore used as the baseline probability distribution, with cumulative distribution function
$$G(x) = 1 - e^{-\alpha x}, \quad x > 0,\ \alpha > 0.$$
We have $\bar{G}(x) = 1 - G(x) = e^{-\alpha x}$ and $G(x)/\bar{G}(x) = e^{\alpha x} - 1$. Extending this odd function by an auxiliary parameter gives an exponentiated function [28,29]. Let $\theta > 0$ be an auxiliary parameter on the odd function; the resulting exponentiated odd function is
$$W(x) = \left[\frac{G(x)}{\bar{G}(x)}\right]^{\theta} = \left(e^{\alpha x} - 1\right)^{\theta}.$$
Similarly, the T-X family of distributions extends the beta-generated distributions by taking any non-negative continuous random variable T as a generator instead of a beta random variable [30], which gives
$$F(x) = \int_{0}^{W(x)} r(t)\,dt.$$
Here the generator $r(t)$ is the probability density function of the Lomax distribution. The Lomax distribution (also known as the Pareto type II distribution) is widely applied in actuarial science, reliability modeling, life testing, economics, network analysis, and operations research [31]. The PDF of the Lomax distribution used as a generator is
$$r(t) = \frac{\lambda}{\delta}\left(1 + \frac{t}{\delta}\right)^{-(\lambda+1)}, \quad t > 0,\ \lambda > 0,\ \delta > 0.$$
We compound the PDF of the Lomax distribution, as a generator, with the exponentiated odd function $W(x)$ because the exponential distribution has a single scale parameter and the Lomax distribution has one scale and one shape parameter. After compounding, the resulting distribution has two scale and two shape parameters, which makes it more robust and flexible. As a result, it can capture different types of data, such as skewed, truncated, and non-truncated data.
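Therefore, the CDF of the exponentiated odd Lomax exponential distribution is
$$F(x) = 1 - \left[1 + \frac{1}{\delta}\left(e^{\alpha x} - 1\right)^{\theta}\right]^{-\lambda}, \quad x > 0. \tag{3}$$
The corresponding PDF of the proposed distribution is
$$f(x) = \frac{\alpha\theta\lambda}{\delta}\, e^{\alpha x}\left(e^{\alpha x} - 1\right)^{\theta - 1}\left[1 + \frac{1}{\delta}\left(e^{\alpha x} - 1\right)^{\theta}\right]^{-(\lambda+1)}, \quad x > 0. \tag{4}$$
Here, α > 0 and δ > 0 are scale parameters and λ > 0 and θ > 0 are shape parameters. With θ = 2.5 and δ = 2.0 fixed, the PDF (4) is platykurtic and positively skewed at α = 1.0, λ = 1.0 and at α = 1.5, λ = 1.5, symmetrical at α = 2.0, λ = 2.0, and leptokurtic as α and λ increase further [Fig 1, (left panel)].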
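The CDF and PDF above can be checked numerically. The following is a minimal Python sketch, not the authors' code; the function names `eole_cdf` and `eole_pdf` and the chosen parameter values are illustrative assumptions. It compares a finite-difference slope of the CDF with the PDF at one point.

```python
import math

def eole_cdf(x, alpha, theta, lam, delta):
    """F(x) = 1 - [1 + (e^{alpha x} - 1)^theta / delta]^(-lam) for x > 0."""
    if x <= 0:
        return 0.0
    w = math.expm1(alpha * x) ** theta  # exponentiated odd function W(x)
    return 1.0 - (1.0 + w / delta) ** (-lam)

def eole_pdf(x, alpha, theta, lam, delta):
    """Density obtained by differentiating the CDF with respect to x."""
    if x <= 0:
        return 0.0
    u = math.expm1(alpha * x)  # e^{alpha x} - 1
    return (alpha * theta * lam / delta) * math.exp(alpha * x) \
        * u ** (theta - 1) * (1.0 + u ** theta / delta) ** (-(lam + 1))

# Illustrative (alpha, theta, lambda, delta); not estimates from the paper.
params = (1.5, 2.5, 1.5, 2.0)
h = 1e-6
slope = (eole_cdf(0.5 + h, *params) - eole_cdf(0.5 - h, *params)) / (2 * h)
print(slope, eole_pdf(0.5, *params))  # the two values should agree closely
```

For θ = 1 and δ = 1 the CDF reduces to 1 − exp(−αλx), an exponential distribution with rate αλ, which gives a convenient sanity check.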
Likewise, the survival function is the complement of the CDF and gives the probability of surviving beyond time x. Mathematically, R(x) = 1 − F(x). Hence, the survival function of the proposed distribution is
$$R(x) = \left[1 + \frac{1}{\delta}\left(e^{\alpha x} - 1\right)^{\theta}\right]^{-\lambda}, \quad x > 0. \tag{5}$$
The hazard rate function is the conditional density given that the event has not yet occurred before time x. Mathematically, let x be the survival time of a component or item; the probability that it will not survive an additional time Δx leads to the hazard rate function
$$h(x) = \lim_{\Delta x \to 0} \frac{F(x + \Delta x) - F(x)}{\Delta x \, R(x)} = \frac{f(x)}{R(x)}, \quad x > 0.$$
Therefore, the hazard rate function of the proposed model is
$$h(x) = \frac{\alpha\theta\lambda}{\delta}\, e^{\alpha x}\left(e^{\alpha x} - 1\right)^{\theta - 1}\left[1 + \frac{1}{\delta}\left(e^{\alpha x} - 1\right)^{\theta}\right]^{-1}, \quad x > 0. \tag{6}$$
With θ = 2.5 and δ = 2.0 fixed, the hazard function (6) is monotonically increasing at (α = 1.0, λ = 1.0), (α = 1.5, λ = 1.5) and (α = 2.0, λ = 2.0). As α and λ increase further, it changes from monotonically increasing to inverted bathtub shaped at (α = 2.5, λ = 2.5) and (α = 3.0, λ = 3.0) [Fig 1, (right panel)].
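Similarly, the reversed hazard rate function is the ratio of the density to the distribution function, which is useful in reliability analysis. It is
$$h_{rev}(x) = \frac{f(x)}{F(x)} = \frac{\dfrac{\alpha\theta\lambda}{\delta}\, e^{\alpha x}\left(e^{\alpha x} - 1\right)^{\theta - 1}\left[1 + \dfrac{1}{\delta}\left(e^{\alpha x} - 1\right)^{\theta}\right]^{-(\lambda+1)}}{1 - \left[1 + \dfrac{1}{\delta}\left(e^{\alpha x} - 1\right)^{\theta}\right]^{-\lambda}}, \quad x > 0. \tag{7}$$
Likewise, the cumulative hazard rate function is not a probability; rather, it measures accumulated risk. It is defined as
$$H(x) = -\ln R(x) = \lambda \ln\left[1 + \frac{1}{\delta}\left(e^{\alpha x} - 1\right)^{\theta}\right], \quad x > 0. \tag{8}$$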
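The reliability functions of this section are linked by simple identities (h = f/R, H = −ln R, and dH/dx = h). The sketch below, with illustrative function names and parameter values that are assumptions rather than part of the paper, evaluates them and tabulates the hazard on a small grid.

```python
import math

def eole_survival(x, alpha, theta, lam, delta):
    """R(x) = [1 + (e^{alpha x} - 1)^theta / delta]^(-lam)."""
    return (1.0 + math.expm1(alpha * x) ** theta / delta) ** (-lam)

def eole_hazard(x, alpha, theta, lam, delta):
    """h(x) = f(x) / R(x), written with the common factors cancelled."""
    u = math.expm1(alpha * x)
    return alpha * theta * lam * math.exp(alpha * x) * u ** (theta - 1) \
        / (delta + u ** theta)

def eole_cum_hazard(x, alpha, theta, lam, delta):
    """H(x) = -ln R(x) = lam * ln(1 + (e^{alpha x} - 1)^theta / delta)."""
    return lam * math.log1p(math.expm1(alpha * x) ** theta / delta)

def eole_reversed_hazard(x, alpha, theta, lam, delta):
    """f(x) / F(x), using f = h * R and F = 1 - R."""
    surv = eole_survival(x, alpha, theta, lam, delta)
    return eole_hazard(x, alpha, theta, lam, delta) * surv / (1.0 - surv)

# Illustrative setting with theta = 2.5 and delta = 2.0 held fixed.
params = (1.0, 2.5, 1.0, 2.0)
for x in (0.25, 0.5, 1.0, 2.0):
    print(x, round(eole_hazard(x, *params), 4))
```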

Statistical properties
In this section, some properties of the EOLE distribution are derived.

Useful expansions
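The expansions used in this section are derived from the generalized binomial series. For |z| < 1 and n > 0, we have
$$(1 + z)^{-n} = \sum_{j=0}^{\infty} (-1)^{j} \binom{n + j - 1}{j} z^{j}.$$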

Quantile and median
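The quantile function is used in the theoretical study of a probability distribution. It is an alternative to the PDF and CDF and is used to obtain statistical measures such as the median, skewness, and kurtosis. It is also used to generate random numbers. The quantile function is given by $Q(u) = F^{-1}(u)$. Therefore, the corresponding quantile function of the proposed distribution is
$$Q(u) = \frac{1}{\alpha} \ln\left\{1 + \left[\delta\left((1 - u)^{-1/\lambda} - 1\right)\right]^{1/\theta}\right\}, \tag{10}$$
where u ~ U(0,1). In particular, the median is obtained by setting $u = \tfrac{1}{2}$ in Eq (10):
$$\text{Median} = \frac{1}{\alpha} \ln\left\{1 + \left[\delta\left(2^{1/\lambda} - 1\right)\right]^{1/\theta}\right\}.$$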
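Because the quantile function has a closed form, random numbers can be generated by inverse-transform sampling. The sketch below is an illustration under assumed names (`eole_quantile`, `eole_sample`) and arbitrary parameter values; it draws a sample and compares its median with the theoretical median.

```python
import math
import random

def eole_cdf(x, alpha, theta, lam, delta):
    """CDF of the EOLE distribution for x > 0."""
    return 1.0 - (1.0 + math.expm1(alpha * x) ** theta / delta) ** (-lam)

def eole_quantile(u, alpha, theta, lam, delta):
    """Q(u) = (1/alpha) ln{1 + [delta((1-u)^(-1/lam) - 1)]^(1/theta)}, 0 <= u < 1."""
    w = delta * ((1.0 - u) ** (-1.0 / lam) - 1.0)
    return math.log1p(w ** (1.0 / theta)) / alpha

def eole_sample(n, alpha, theta, lam, delta, rng):
    """Inverse-transform sampling: apply Q to uniform(0, 1) draws."""
    return [eole_quantile(rng.random(), alpha, theta, lam, delta) for _ in range(n)]

params = (1.5, 2.5, 1.5, 2.0)   # illustrative values, not fitted estimates
median = eole_quantile(0.5, *params)
rng = random.Random(2020)       # fixed seed so the run is repeatable
draws = sorted(eole_sample(20001, *params, rng))
print(median, draws[10000])     # theoretical vs. sample median
```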

Asymptotic behavior and mode
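To examine the asymptotic behavior, we check $\lim_{x \to 0} f(x)$ and $\lim_{x \to \infty} f(x)$. If both limits converge to zero, the proposed model satisfies the asymptotic property and a mode exists. From Eq (4),
$$\lim_{x \to 0} f(x) = 0 \ (\text{for } \theta > 1) \quad \text{and} \quad \lim_{x \to \infty} f(x) = 0.$$
Further, to calculate the mode, we take the logarithm of Eq (4) and get
$$\ln f(x) = \ln\frac{\alpha\theta\lambda}{\delta} + \alpha x + (\theta - 1)\ln\left(e^{\alpha x} - 1\right) - (\lambda + 1)\ln\left[1 + \frac{1}{\delta}\left(e^{\alpha x} - 1\right)^{\theta}\right]. \tag{11}$$
Differentiating Eq (11) with respect to x and applying the conditions $f(x) \neq 0$ and $f'(x) = 0$, the mode of the proposed distribution is the solution of
$$\alpha + \frac{\alpha(\theta - 1)e^{\alpha x}}{e^{\alpha x} - 1} - \frac{\alpha\theta(\lambda + 1)\, e^{\alpha x}\left(e^{\alpha x} - 1\right)^{\theta - 1}}{\delta + \left(e^{\alpha x} - 1\right)^{\theta}} = 0. \tag{12}$$
Eq (12) is a nonlinear equation that cannot be solved analytically. It can be solved numerically, for example by the Newton-Raphson method.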

Moments
The moments of a probability distribution describe its characteristics, such as the mean, standard deviation, skewness, and kurtosis. Let X be a random variable following the EOLE distribution; then the r-th raw moment of the proposed distribution is
$$\mu'_r = E(X^r) = \int_{0}^{\infty} x^{r} f(x)\,dx. \tag{13}$$
Alternatively, we define the moments of the proposed distribution from the quantile function [32,33]. The r-th raw moment of the proposed distribution is
$$\mu'_r = \int_{0}^{1} \left[Q_G(u)\right]^{r} du, \tag{14}$$
where $Q_G(u)$ is the quantile function (10). Then Eq (14) is
$$\mu'_r = \int_{0}^{1} \left\{\frac{1}{\alpha} \ln\left(1 + \left[\delta\left((1 - u)^{-1/\lambda} - 1\right)\right]^{1/\theta}\right)\right\}^{r} du. \tag{15}$$
By simplification, we obtain the r-th raw moment of the proposed distribution, Eq (16) [33,34].
In particular, the first four moments of X are obtained by substituting r = 1, 2, 3 and 4 in Eq (16).
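The quantile form of the raw moment, the integral of Q(u)^r over (0, 1), is easy to approximate numerically. The sketch below uses a plain midpoint rule, which is an assumed numerical scheme rather than the series expression of the paper; the function name `raw_moment` is hypothetical. As a check it uses the special case θ = 1, δ = 1, where the distribution reduces to an exponential with rate αλ, so the mean is 1/(αλ) and the second raw moment is 2/(αλ)².

```python
import math

def eole_quantile(u, alpha, theta, lam, delta):
    """Quantile function Q(u) of the EOLE distribution."""
    w = delta * ((1.0 - u) ** (-1.0 / lam) - 1.0)
    return math.log1p(w ** (1.0 / theta)) / alpha

def raw_moment(r, alpha, theta, lam, delta, n=20000):
    """Midpoint-rule approximation of the integral of Q(u)^r over (0, 1)."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        total += eole_quantile((i + 0.5) * step, alpha, theta, lam, delta) ** r
    return step * total

# Special case theta = delta = 1: exponential with rate alpha * lam = 3.
mean = raw_moment(1, 2.0, 1.0, 1.5, 1.0)
second = raw_moment(2, 2.0, 1.0, 1.5, 1.0)
print(mean, second - mean ** 2)  # approximate mean and variance
```

The midpoint rule avoids evaluating Q at u = 1, where the quantile diverges; a finer grid or an adaptive rule would be needed for high-accuracy work.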

Conditional moments
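Conditional moments are also of interest for models with an increasing failure rate. The r-th conditional moment is
$$E(X^r \mid X > x) = \frac{1}{R(x)} \int_{x}^{\infty} t^{r} f(t)\,dt.$$
Alternatively, we can define the conditional moments from the quantile function:
$$E(X^r \mid X > x) = \frac{1}{R(x)} \int_{F(x)}^{1} \left[Q(u)\right]^{r} du,$$
where u = F(x) is the CDF and R(x) is the survival function of the proposed model. Substituting the quantile function (10), the conditional moments are
$$E(X^r \mid X > x) = \frac{1}{R(x)} \int_{F(x)}^{1} \left\{\frac{1}{\alpha} \ln\left(1 + \left[\delta\left((1 - u)^{-1/\lambda} - 1\right)\right]^{1/\theta}\right)\right\}^{r} du.$$
In particular, setting r = 1 gives the conditional mean $E(X \mid X > x)$.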

Mean residual life
The Mean Residual Life (MRL) is the average remaining life, X − x, given that the item has survived to time x. Thus, the expected additional lifetime given that a component has survived until time x is called the MRL. It is defined as
$$m(x) = E(X - x \mid X > x) = \frac{1}{R(x)} \int_{x}^{\infty} t f(t)\,dt - x.$$
Alternatively, the MRL of the proposed distribution can be defined from the quantile function as
$$m(x) = \frac{1}{R(x)} \int_{F(x)}^{1} Q(u)\,du - x,$$
where F(x) is the CDF and R(x) is the survival function of the proposed distribution.

Mean past lifetime
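The Mean Past Lifetime (MPL) is the mean of the conditional random variable $x - X \mid X \le x$. It represents the time elapsed since the failure of the component, given that its lifetime is less than or equal to x. It can be calculated as
$$k(x) = E(x - X \mid X \le x) = x - \frac{1}{F(x)} \int_{0}^{x} t f(t)\,dt.$$
It can alternatively be defined from the quantile function:
$$k(x) = x - \frac{1}{F(x)} \int_{0}^{F(x)} Q(u)\,du.$$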

Mean deviation
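The Mean Deviation (MD) from the mean and from the median measures the scatter around a central value, either the mean μ or the median $m_d$. The MD is defined as
$$MD(\mu) = \int_{0}^{\infty} |x - \mu| f(x)\,dx \quad \text{and} \quad MD(m_d) = \int_{0}^{\infty} |x - m_d| f(x)\,dx.$$
We obtain MD(μ) and MD($m_d$) using the following relationships:
$$MD(\mu) = 2\mu F(\mu) - 2\mu + 2\int_{\mu}^{\infty} x f(x)\,dx. \tag{24}$$
Likewise,
$$MD(m_d) = -\mu + 2\int_{m_d}^{\infty} x f(x)\,dx. \tag{25}$$
We calculate $\int_{m}^{\infty} x f(x)\,dx$ in terms of the quantile function as
$$\int_{m}^{\infty} x f(x)\,dx = \int_{F(m)}^{1} Q(u)\,du.$$
Finally, Eqs (24) and (25) become
$$MD(\mu) = 2\mu F(\mu) - 2\mu + 2\int_{F(\mu)}^{1} Q(u)\,du$$
and
$$MD(m_d) = -\mu + 2\int_{1/2}^{1} Q(u)\,du.$$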

Order statistics
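Order statistics have been applied extensively in many fields of statistics, such as reliability and life testing. Let $X_1, X_2, \ldots, X_n$ be a random sample from (4) and $X_{1:n} \le X_{2:n} \le \ldots \le X_{n:n}$ the corresponding order statistics. The probability density function of the r-th order statistic $X_{r:n}$, $1 \le r \le n$ [33], is given by
$$f_{r:n}(x) = \frac{n!}{(r-1)!\,(n-r)!}\left[F(x)\right]^{r-1}\left[1 - F(x)\right]^{n-r} f(x). \tag{26}$$
Substituting the CDF and PDF of the proposed distribution into Eq (26) and writing $z(x) = 1 + \frac{1}{\delta}\left(e^{\alpha x} - 1\right)^{\theta}$, the equation becomes
$$f_{r:n}(x) = \frac{n!}{(r-1)!\,(n-r)!}\,\frac{\alpha\theta\lambda}{\delta}\, e^{\alpha x}\left(e^{\alpha x} - 1\right)^{\theta-1} z(x)^{-[\lambda(n-r+1)+1]}\left[1 - z(x)^{-\lambda}\right]^{r-1}. \tag{27}$$
When r = n, Eq (27) gives the PDF of the largest order statistic $X_{n:n}$:
$$f_{n:n}(x) = n\,\frac{\alpha\theta\lambda}{\delta}\, e^{\alpha x}\left(e^{\alpha x} - 1\right)^{\theta-1} z(x)^{-(\lambda+1)}\left[1 - z(x)^{-\lambda}\right]^{n-1}, \quad x_{(n)} > 0.$$
Similarly, when r = 1, Eq (27) gives the PDF of the smallest order statistic $X_{1:n}$:
$$f_{1:n}(x) = n\,\frac{\alpha\theta\lambda}{\delta}\, e^{\alpha x}\left(e^{\alpha x} - 1\right)^{\theta-1} z(x)^{-(\lambda+1)}\left[z(x)^{-\lambda}\right]^{n-1}, \quad x_{(1)} > 0.$$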

Bonferroni and Lorenz curve
The Bonferroni and Lorenz curves [33] are widely used to measure poverty and income inequality. Such curves are also widely used in other fields, such as demography, medicine, reliability, and insurance.

Methods of estimation
We estimate the unknown parameters of the proposed model by maximum likelihood estimation, the method of least squares, weighted least squares, and the Cramér-von Mises technique.

Maximum Likelihood Estimation (MLE)
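Let $x_1, x_2, \ldots, x_n$ be a random sample from the EOLE distribution with parameters (α, θ, λ, δ). The likelihood function of the proposed distribution is the product of the PDF evaluated at the n sample values,
$$\ell(\Phi) = \prod_{i=1}^{n} f(x_i; \alpha, \theta, \lambda, \delta),$$
where Φ = (α, θ, λ, δ) belongs to the parameter space. Therefore, the log-likelihood function of the proposed distribution is
$$\ln \ell = n \ln\frac{\alpha\theta\lambda}{\delta} + \alpha\sum_{i=1}^{n} x_i + (\theta - 1)\sum_{i=1}^{n} \ln\left(e^{\alpha x_i} - 1\right) - (\lambda + 1)\sum_{i=1}^{n} \ln\left[1 + \frac{1}{\delta}\left(e^{\alpha x_i} - 1\right)^{\theta}\right]. \tag{28}$$
The maximum likelihood estimates are obtained by partially differentiating Eq (28) with respect to each parameter. Let $\xi_i = e^{\alpha x_i}$ and $u_i = e^{\alpha x_i} - 1$; we have
$$\frac{\partial \ln \ell}{\partial \alpha} = \frac{n}{\alpha} + \sum_{i=1}^{n} x_i + (\theta - 1)\sum_{i=1}^{n} \frac{x_i \xi_i}{u_i} - \theta(\lambda + 1)\sum_{i=1}^{n} \frac{x_i \xi_i u_i^{\theta - 1}}{\delta + u_i^{\theta}},$$
$$\frac{\partial \ln \ell}{\partial \theta} = \frac{n}{\theta} + \sum_{i=1}^{n} \ln u_i - (\lambda + 1)\sum_{i=1}^{n} \frac{u_i^{\theta} \ln u_i}{\delta + u_i^{\theta}},$$
$$\frac{\partial \ln \ell}{\partial \lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} \ln\left(1 + \frac{u_i^{\theta}}{\delta}\right),$$
$$\frac{\partial \ln \ell}{\partial \delta} = -\frac{n}{\delta} + \frac{\lambda + 1}{\delta}\sum_{i=1}^{n} \frac{u_i^{\theta}}{\delta + u_i^{\theta}}.$$
Finally, we solve the non-linear equations $\frac{\partial \ln \ell}{\partial \alpha} = 0$, $\frac{\partial \ln \ell}{\partial \theta} = 0$, $\frac{\partial \ln \ell}{\partial \lambda} = 0$ and $\frac{\partial \ln \ell}{\partial \delta} = 0$ for α, θ, λ and δ to get the maximum likelihood estimates $(\hat{\alpha}, \hat{\theta}, \hat{\lambda}, \hat{\delta})$ of the parameters (α, θ, λ, δ). For interval estimation of the parameters, we calculate the observed information matrix,
$$I(\hat{\Phi}) = -\left[\frac{\partial^{2} \ln \ell}{\partial \phi_a\, \partial \phi_b}\right]_{a,b = 1}^{4}\Bigg|_{\Phi = \hat{\Phi}}, \quad (\phi_1, \phi_2, \phi_3, \phi_4) = (\alpha, \theta, \lambda, \delta).$$
Furthermore, by the asymptotic normality of MLEs, approximate 100(1 − γ)% confidence intervals for α, θ, λ and δ can be constructed as $\hat{\alpha} \pm z_{\gamma/2}\, SE(\hat{\alpha})$, $\hat{\theta} \pm z_{\gamma/2}\, SE(\hat{\theta})$, $\hat{\lambda} \pm z_{\gamma/2}\, SE(\hat{\lambda})$ and $\hat{\delta} \pm z_{\gamma/2}\, SE(\hat{\delta})$, where $z_{\gamma/2}$ is the upper (γ/2)-th percentile of the standard normal variate.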
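As an illustration of the likelihood equations, the sketch below codes the log-likelihood in plain Python and uses the fact that the equation for λ can be solved in closed form when the other three parameters are held fixed (λ = n / Σ ln(1 + u_i^θ/δ)). This is only a partial, assumed workflow: a full fit would pass the log-likelihood to a numerical optimizer. The data values and function names are hypothetical.

```python
import math

def eole_loglik(data, alpha, theta, lam, delta):
    """Log-likelihood of the EOLE model for positive observations."""
    total = len(data) * math.log(alpha * theta * lam / delta)
    for x in data:
        u = math.expm1(alpha * x)  # e^{alpha x} - 1
        total += alpha * x + (theta - 1.0) * math.log(u) \
            - (lam + 1.0) * math.log1p(u ** theta / delta)
    return total

def lam_hat(data, alpha, theta, delta):
    """Root of d lnL / d lambda = 0 with alpha, theta, delta held fixed."""
    s = sum(math.log1p(math.expm1(alpha * x) ** theta / delta) for x in data)
    return len(data) / s

# Hypothetical observations, not taken from the paper's data sets.
data = [0.31, 0.42, 0.55, 0.63, 0.78, 0.91, 1.10]
lam_star = lam_hat(data, 1.5, 2.5, 2.0)
print(lam_star, eole_loglik(data, 1.5, 2.5, lam_star, 2.0))
```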

Method of Least-Square Estimation (LSE)
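The least square and weighted least square estimators were originally introduced to estimate the parameters of the beta distribution [35,36]. This technique is used to estimate the unknown parameters of the proposed distribution by minimizing the following function with respect to α, θ, λ and δ:
$$U(X; \alpha, \theta, \lambda, \delta) = \sum_{k=1}^{n}\left[F(x_{(k)}) - \frac{k}{n + 1}\right]^{2}, \tag{35}$$
where $x_{(k)}$ denotes the k-th ordered observation. The parameter values are obtained by partially differentiating Eq (35) with respect to the corresponding parameters.
Let $\xi_k = e^{\alpha x_{(k)}}$, $u_k = e^{\alpha x_{(k)}} - 1$ and $t_k = 1 + \frac{1}{\delta}\left(e^{\alpha x_{(k)}} - 1\right)^{\theta}$, so that $F(x_{(k)}) = 1 - t_k^{-\lambda}$; then the partial derivatives of Eq (35) are
$$\frac{\partial U}{\partial \alpha} = \frac{2\lambda\theta}{\delta}\sum_{k=1}^{n}\left[1 - t_k^{-\lambda} - \frac{k}{n+1}\right] x_{(k)}\, \xi_k\, u_k^{\theta - 1}\, t_k^{-(\lambda+1)},$$
$$\frac{\partial U}{\partial \theta} = \frac{2\lambda}{\delta}\sum_{k=1}^{n}\left[1 - t_k^{-\lambda} - \frac{k}{n+1}\right] u_k^{\theta} \ln(u_k)\, t_k^{-(\lambda+1)},$$
$$\frac{\partial U}{\partial \lambda} = 2\sum_{k=1}^{n}\left[1 - t_k^{-\lambda} - \frac{k}{n+1}\right] t_k^{-\lambda} \ln(t_k),$$
$$\frac{\partial U}{\partial \delta} = -\frac{2\lambda}{\delta^{2}}\sum_{k=1}^{n}\left[1 - t_k^{-\lambda} - \frac{k}{n+1}\right] u_k^{\theta}\, t_k^{-(\lambda+1)}.$$
We solve the non-linear equations $\frac{\partial U}{\partial \alpha} = 0$, $\frac{\partial U}{\partial \theta} = 0$, $\frac{\partial U}{\partial \lambda} = 0$ and $\frac{\partial U}{\partial \delta} = 0$ to estimate the unknown parameters of the proposed distribution, which minimizes the function with respect to α, θ, λ and δ.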

Weighted least-square estimation
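The weighted least-squares estimators of the unknown parameters are obtained by minimizing, with respect to α, θ, λ and δ, the function
$$W(X; \alpha, \theta, \lambda, \delta) = \sum_{k=1}^{n} w_k\left[F(x_{(k)}) - \frac{k}{n + 1}\right]^{2}, \tag{40}$$
where $w_k = \dfrac{1}{Var\left(F(X_{(k)})\right)} = \dfrac{(n+1)^{2}(n+2)}{k(n-k+1)}$ is the weight for the proposed model. Hence, the weighted least-square estimators of α, θ, λ and δ are obtained by partially differentiating Eq (40) with respect to each parameter and setting the result equal to zero:
$$\frac{\partial W}{\partial \phi} = 2\sum_{k=1}^{n} w_k\left[F(x_{(k)}) - \frac{k}{n+1}\right]\frac{\partial F(x_{(k)})}{\partial \phi} = 0, \quad \phi \in \{\alpha, \theta, \lambda, \delta\}.$$
We solve the non-linear equations $\frac{\partial W}{\partial \alpha} = 0$, $\frac{\partial W}{\partial \theta} = 0$, $\frac{\partial W}{\partial \lambda} = 0$ and $\frac{\partial W}{\partial \delta} = 0$ to estimate the unknown parameters of the proposed distribution, which minimizes the function with respect to α, θ, λ and δ.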

PLOS ONE
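The Cramér-von Mises estimators are obtained by minimizing, with respect to α, θ, λ and δ, the function
$$C(X; \alpha, \theta, \lambda, \delta) = \frac{1}{12n} + \sum_{k=1}^{n}\left[F(x_{(k)}) - \frac{2k - 1}{2n}\right]^{2}. \tag{45}$$
The Cramér-von Mises estimators of α, θ, λ and δ are obtained by partially differentiating Eq (45) with respect to each parameter and setting the result equal to zero:
$$\frac{\partial C}{\partial \phi} = 2\sum_{k=1}^{n}\left[F(x_{(k)}) - \frac{2k-1}{2n}\right]\frac{\partial F(x_{(k)})}{\partial \phi} = 0, \quad \phi \in \{\alpha, \theta, \lambda, \delta\}.$$
We solve the non-linear equations $\frac{\partial C}{\partial \alpha} = 0$, $\frac{\partial C}{\partial \theta} = 0$, $\frac{\partial C}{\partial \lambda} = 0$ and $\frac{\partial C}{\partial \delta} = 0$ to estimate the unknown parameters of the proposed distribution, which minimizes the function with respect to α, θ, λ and δ.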
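The three distance-based criteria differ only in their plotting positions and weights. The following sketch writes them as Python objective functions that a numerical minimizer could take as input; it assumes the standard forms of these criteria (plotting positions k/(n+1) for LSE and WLSE, (2k−1)/(2n) for CVM), and the function names are hypothetical.

```python
import math

def eole_cdf(x, alpha, theta, lam, delta):
    """CDF of the EOLE distribution for x > 0."""
    return 1.0 - (1.0 + math.expm1(alpha * x) ** theta / delta) ** (-lam)

def eole_quantile(u, alpha, theta, lam, delta):
    """Quantile function, used here only to build check data."""
    w = delta * ((1.0 - u) ** (-1.0 / lam) - 1.0)
    return math.log1p(w ** (1.0 / theta)) / alpha

def ls_objective(data, params):
    """Sum of squared gaps between F(x_(k)) and k/(n+1)."""
    xs, n = sorted(data), len(data)
    return sum((eole_cdf(x, *params) - k / (n + 1)) ** 2
               for k, x in enumerate(xs, start=1))

def wls_objective(data, params):
    """Least squares weighted by 1 / Var(F(X_(k)))."""
    xs, n = sorted(data), len(data)
    return sum((n + 1) ** 2 * (n + 2) / (k * (n - k + 1))
               * (eole_cdf(x, *params) - k / (n + 1)) ** 2
               for k, x in enumerate(xs, start=1))

def cvm_objective(data, params):
    """Cramer-von Mises criterion with positions (2k-1)/(2n)."""
    xs, n = sorted(data), len(data)
    return 1.0 / (12 * n) + sum((eole_cdf(x, *params) - (2 * k - 1) / (2 * n)) ** 2
                                for k, x in enumerate(xs, start=1))

params = (1.5, 2.5, 1.5, 2.0)  # illustrative parameter values
n = 9
exact = [eole_quantile(k / (n + 1), *params) for k in range(1, n + 1)]
print(ls_objective(exact, params), cvm_objective(exact, params))
```

Data placed exactly at the model quantiles of k/(n+1) drive the least-squares objectives to zero, which is a quick way to confirm the objective is coded consistently with the CDF.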

Results and discussion
The data analysis was carried out in two phases: first a simulation study and then a real data analysis. In the real data analysis, two data sets were used to validate the proposed model: (i) the number of deaths per day due to the first wave of COVID-19 in Nepal, and (ii) failure stresses (in GPa) of single carbon fibers of lengths 50 mm.

Simulation study
In the simulation study, we estimate the parameters of the proposed distribution by maximum likelihood estimation. The performance of the ML estimators is assessed through their average bias and Mean Square Error (MSE) for different sample sizes. For this purpose, 10000 random samples of sizes 50, 200, 500, and 750 are generated with different combinations of (α, θ, λ, δ). An iterative technique is used to compute the ML estimates for each sample size. As expected, the average bias and MSE of each parameter tend to zero as the sample size increases, which indicates the consistency of the estimators (Table 1).
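The structure of such a simulation can be sketched in a few lines. The example below is a deliberately simplified assumption, not the design used for Table 1: only λ is estimated (through its closed-form likelihood root with α, θ and δ fixed at their true values), and the number of replications is small. It still shows how bias and MSE are computed and how they shrink with sample size.

```python
import math
import random

def eole_quantile(u, alpha, theta, lam, delta):
    """Quantile function used for inverse-transform sampling."""
    w = delta * ((1.0 - u) ** (-1.0 / lam) - 1.0)
    return math.log1p(w ** (1.0 / theta)) / alpha

def lam_hat(data, alpha, theta, delta):
    """Closed-form ML estimate of lambda when the other parameters are known."""
    s = sum(math.log1p(math.expm1(alpha * x) ** theta / delta) for x in data)
    return len(data) / s

def simulate(n, reps, alpha, theta, lam, delta, seed=1):
    """Return (average bias, MSE) of lam_hat over `reps` samples of size n."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        sample = [eole_quantile(rng.random(), alpha, theta, lam, delta)
                  for _ in range(n)]
        estimates.append(lam_hat(sample, alpha, theta, delta))
    bias = sum(estimates) / reps - lam
    mse = sum((e - lam) ** 2 for e in estimates) / reps
    return bias, mse

for size in (20, 200):
    print(size, simulate(size, 300, 1.5, 2.5, 1.5, 2.0))
```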

Real data analysis I. Number of deaths per day due to COVID-19 in Nepal.
COVID-19, the coronavirus disease first identified in 2019, is a worldwide pandemic that has also affected Nepal. In Nepal, the first COVID-19 case was confirmed on 23 January 2020 and the first death occurred on 14 May. Because of the pandemic, the government enforced a nationwide lockdown from March 24, 2020, to July 21, 2020. Following that, the government concentrated its efforts on PCR testing and other health-related initiatives. Every day, the Ministry of Health and Population (MOHP) of Nepal reported data on COVID-19, such as the test positive rate, the number of deaths, and the number of infections [26]. During the research period, the researchers collected these data daily, from 23 January 2020 to 24 December 2020, for the whole country.
Among these data, we select the number of deaths to validate the proposed model. A total of 1,808 deaths due to the first wave of COVID-19 had been recorded in Nepal by the end of 24 December 2020. On average, 5.4 ± 6 people died per day due to COVID-19 (from 23 January to 24 December). The summary of daily deaths is presented in Table 2.

To validate the proposed model, we use as sample data the days on which at least two deaths occurred. On each of the last 153 days of the study period, at least two people died, as reported below [26].

Total time test plot
The total time on test (TTT) plot is an important graphical method for checking whether a data set can be modeled by a particular distribution. The plot can be obtained easily with the TTT function of the AdequacyModel package in R. It is used to validate the hazard rate function [37,39]. The empirical version of the TTT plot is
$$T\left(\frac{r}{n}\right) = \frac{\sum_{i=1}^{r} y_{i:n} + (n - r)\, y_{r:n}}{\sum_{i=1}^{n} y_{i:n}},$$
where $y_{r:n}$ (r = 1, 2, . . ., n) and $y_{i:n}$ (i = 1, 2, . . ., n) are the order statistics of the sample. The TTT plot is convex for a decreasing failure rate, concave for an increasing failure rate, or a combination of both for a bathtub-shaped failure rate. Here, the TTT plot of the illustrative data set is concave, which corresponds to an increasing failure rate. This indicates that the data set is suitable for further analysis [Fig 2 (left panel)] [37].
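The empirical TTT transform is simple to compute directly. The sketch below is a plain-Python illustration (the paper itself uses an R package); the function name `ttt_points` and the toy data are assumptions made for the example.

```python
def ttt_points(values):
    """Return the points (r/n, T(r/n)) of the empirical scaled TTT transform."""
    ys = sorted(values)
    n = len(ys)
    total = sum(ys)
    points = []
    running = 0.0                      # partial sum of the r smallest values
    for r, y in enumerate(ys, start=1):
        running += y
        points.append((r / n, (running + (n - r) * y) / total))
    return points

# Toy data, not from the paper: the points lie above the diagonal (concave).
for p in ttt_points([1, 2, 3, 4]):
    print(p)
```

Points above the 45-degree line indicate a concave TTT curve, which corresponds to an increasing failure rate.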

Box plot
The summary of the data set is presented in a box plot, which gives a clear picture of the descriptive characteristics of the illustrative data set [Fig 2 (right panel)].

Parameter estimation
We computed the parameter values by maximizing the log-likelihood function in Eq (28) and by minimizing the least square function in Eq (35), the weighted least square function in Eq (40), and the Cramér-von Mises function in Eq (45), directly using the maxLik() function in R [38,39]. The estimated values of $(\hat{\alpha}, \hat{\theta}, \hat{\lambda}, \hat{\delta})$ computed by the different methods are presented in Table 3.

Distribution characteristics
After estimating the parameters, we determined the characteristics of the proposed distribution from the illustrative data set. The descriptive statistics show that the mean is greater than the median, which in turn is greater than the mode, and that the skewness is positive, so the fitted model is positively skewed. In terms of kurtosis, the distribution is approximately mesokurtic, leaning towards platykurtic (Table 4). (Table 5). Furthermore, we compared the empirical distribution with the theoretical cumulative distribution of the proposed model; the empirical distribution curve is closer to the MLE fit than to the other fits (LSE, WLSE, and CVM) in the illustrative data set [Fig 3 (left panel)]. We also plotted the theoretical PDF of the proposed model using the different estimated values [Fig 3 (right panel)]. In both graphical demonstrations, the MLE estimates fit more appropriately than the others. Furthermore, we predicted the probability of the test positive rate and the probability of the number of deaths per day, and then determined the relationship between these variables. There is a positive relationship between them, which is statistically significant (r = 0.2762, p-value = 0.00054), with a 95% confidence interval of (0.12291-0.41662).
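The reported confidence interval for the correlation can be reproduced with the Fisher z-transformation, assuming n = 153 paired observations (the number of days in the illustrative data set). The source does not state which method was used, so the sketch below is an assumption; the function name `fisher_ci` is hypothetical.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, level=0.95):
    """Confidence interval for a Pearson correlation via Fisher's z-transform."""
    z = math.atanh(r)                  # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)        # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

low, high = fisher_ci(0.2762, 153)     # n = 153 is an assumption
print(round(low, 5), round(high, 5))
```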

Validation of estimation methods
The findings suggest that as the test positive rate increases, the death rate also increases [Fig 4].

Model comparisons/selections
Model selection is an important and integral part of data analysis. Increasing computing power makes it possible to fit more realistic, flexible, and complex models. We compared our proposed model with eleven competitive models, namely: exponentiated half logistic exponential (EHLE) [40], Marshall-Olkin logistic exponential (MOLE) [41], Lomax exponential Weibull (LEW) [42], exponentiated generalized inverted exponential (EGIE) [43], generalized inverted generalized exponential (GIGE) [44], generalized odd inverted exponential exponential  [46], odd Lomax exponential (OLE) [47], type I half-logistic Fréchet (TIHLF) [48], Lindley inverse Weibull (LIW) [36] and half logistic Nadarajah Haghighi extension of exponential (HLNHE) [19]. To compare the proposed model with the competitive models, we first estimate the parameters of each model with the maxLik() function in R by solving the nonlinear equations [38,39]. The estimated parameter values of each distribution, along with their standard errors, are presented in Table 6. The PDF of each competitive model is given in Appendix C of S1 Appendix.
Among the competitive models, the model with the smallest values of -2LL, AIC, BIC, CAIC and HQIC is preferred. The results show that the proposed model has smaller values than all eleven competitive models. Therefore, the proposed model is superior to the others, followed by MOPGW. The GIGE model fits the illustrative data set least well (Table 7).
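For reference, these criteria can be computed from the maximized log-likelihood, the number of parameters k, and the sample size n. The sketch below uses commonly cited definitions; the CAIC form in particular (AIC plus 2k(k+1)/(n−k−1)) is an assumption, since the paper does not spell out its formulas, and the input values are hypothetical.

```python
import math

def information_criteria(loglik, k, n):
    """Return -2LL, AIC, BIC, CAIC and HQIC for a fitted model."""
    m2ll = -2.0 * loglik
    aic = m2ll + 2 * k
    return {
        "-2LL": m2ll,
        "AIC": aic,
        "BIC": m2ll + k * math.log(n),
        "CAIC": aic + 2 * k * (k + 1) / (n - k - 1),   # assumed definition
        "HQIC": m2ll + 2 * k * math.log(math.log(n)),
    }

# Hypothetical log-likelihood for a four-parameter model fitted to 153 points.
crit = information_criteria(-400.0, 4, 153)
for name, value in crit.items():
    print(name, round(value, 4))
```

Smaller values indicate a better trade-off between fit and complexity, so models are ranked by comparing each criterion across candidates fitted to the same data.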
Furthermore, we compared the empirical distribution with the theoretical cumulative distribution of the proposed model; the two curves are close to each other in the illustrative data set. Likewise, the theoretical CDFs of nine competitive models, namely EHLE, MOLE, LEW, EGIE, GIGE, MOPGW, OLE, LIW, and HLNHE, were compared with the theoretical CDF of the proposed model [Fig 5 (left panel)]. The theoretical PDF of the proposed model was also compared with those of the competitive models [Fig 5 (right panel)]. The results suggest that the proposed model fits the illustrative data set better than all the competitive models.
II. Failure stresses (in GPa) of single carbon fibers of lengths 50 mm data set. The second data set "on failure stresses (in GPa) of single carbon fibers of lengths 50 mm" [49] has  (Table 8).
The proposed model has the lowest values of -2LL, AIC, BIC, CAIC, and HQIC among all competitive models, which indicates that it is superior to the others (Table 9).
Similarly, in terms of graphical appearance, the proposed model fits better than the other competitive models [Fig 6].

Conclusion
This study proposed a new four-parameter Exponentiated Odd Lomax Exponential (EOLE) distribution, obtained by compounding an exponentiated odd function with the Lomax distribution as a generator. Several important properties of the new distribution were investigated: quantile function and median; asymptotic properties and mode; moments; mean residual life and mean past lifetime; mean deviation; order statistics; and Bonferroni and Lorenz curves. Further, we employed three well-known estimation methods to estimate the model parameters, namely maximum likelihood estimation, least-square estimation, and the Cramér-von Mises method. To verify the theoretical findings, we carried out a simulation study and analyzed two real data sets, "Number of deaths per day due to COVID-19 first wave in Nepal" and "failure stresses (in GPa) of single carbon fibers of lengths 50 mm". There is a significant positive relationship between the predicted test positive rate and the predicted number of deaths per day. Finally, the analysis of the illustrative data sets showed that the proposed model provides a reasonably better fit than some other well-known models. Therefore, the EOLE distribution can be used as an alternative model for analyzing survival and lifetime data.